Human Genomics

Springer Science and Business Media LLC

Preprints posted in the last 90 days, ranked by how well they match Human Genomics's content profile, based on 21 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.

1
Methicillin-Susceptible Staphylococcus aureus ST398 in atopic dermatitis in Portugal displays pathogenic traits associated with impaired skin barrier function

Caieiro, D.; Faria, N. A.; Botelho, A.; Araujo, M.; Ramos, L.; Calvao, J.; Goncalo, M.; Miragaia, M.

2026-02-18 dermatology 10.64898/2026.02.17.26346495 medRxiv
Top 0.1%
8.4%

Staphylococcus aureus plays a central role in the exacerbation of atopic dermatitis (AD), but the population structure and pathogenic determinants of strains colonizing AD patients remain poorly understood. It is unclear whether these strains mirror those circulating in the general community or whether specific clonal lineages are selectively adapted to the AD skin microenvironment. Data addressing this question are scarce, particularly in Portugal. In this study, we investigated the molecular epidemiology and pathogenic traits of S. aureus colonizing skin lesions in adult patients with AD in Portugal. We found that lesion-associated isolates belonged predominantly to the methicillin-susceptible S. aureus MSSA-ST398 clonal type, a lineage that is widely circulating in the Portuguese community, particularly among vulnerable populations, and that has also been implicated in severe human infections. Notably, isolates from this clonal type in AD harboured specific pathogenicity traits associated with skin barrier disruption, including hemolysin and urease production, which may contribute to their success as colonizers in AD. Our findings highlight that S. aureus colonization in AD arises from a dynamic interplay between community-level molecular epidemiology and disease-specific selective pressures. While circulating lineages provide the genetic background diversity, the AD skin microenvironment appears to shape which clones ultimately become dominant. Such an integrated perspective may help to inform future geographically tailored strategies aimed at limiting bacterial burden and preventing disease exacerbation in AD.

2
A Denisovan-derived Alu insertion in OCA2 contributes to pigmentation diversity in present-day Melanesians

Kim, K.; Pfennig, A.; Syed, S. A.; Moskwa, N.; Oliveira, N. A. J.; Pham, Q.-M.; Hallast, P.; Yilmaz, F.; McDonough, J.; Norton, H. L.; Akey, J. M.; Lee, C.

2026-03-18 genomics 10.64898/2026.03.18.712481 medRxiv
Top 0.1%
3.9%

Modern humans inherited DNA from Neanderthals and Denisovans, but the contribution of introgressed structural variants (SVs) to present-day human phenotypes and adaptation remains poorly understood. Here, we used a graph-genome approach to genotype 96,277 SVs in 3,332 present-day humans and three high-coverage archaic hominin genomes, identifying 153 candidate introgressed SVs. These SVs are enriched for signatures of local adaptation compared to non-introgressed SVs (p-value = 3.04 x 10^-7). Among these, we focused on a Denisovan-derived Alu insertion located in intron 18 of OCA2, a gene central to pigmentation. This introgressed Alu insertion is most frequently observed (> 60%) in Indigenous people from Bougainville Island of Melanesia, and is significantly associated with increased skin pigmentation in this region. To assess its functional impact, the Alu insertion was introduced into human induced pluripotent stem cells (iPSCs), which were subsequently differentiated into melanocytes. Melanocytes harboring the Alu insertion demonstrated elevated OCA2 expression, increased pigmentation, and higher levels of enhancer activity compared to controls. Collectively, these findings highlight introgressed SVs as a significant source of adaptive and phenotypic diversity in modern humans and implicate the Denisovan-derived Alu insertion in OCA2 in pigmentation variation among present-day Melanesian populations.

3
A pilot genome-wide association study of ischemic heart disease with co-occurring arterial hypertension in a Kazakh cohort

Skvortsova, L.; Yergali, K.; Zhaxylykova, A.; Begmanova, M.; Mansharipova, A.

2026-03-23 genetic and genomic medicine 10.64898/2026.03.19.26348868 medRxiv
Top 0.1%
3.3%

Central Asian populations remain underrepresented in genome-wide association studies (GWAS) of ischemic heart disease (IHD). We conducted a pilot GWAS of IHD with co-occurring arterial hypertension in a Kazakh cohort to identify candidate loci for future replication. A case-control GWAS was performed in 451 individuals (236 cases and 215 controls). Genotyping was conducted using the Illumina Infinium Global Screening Array-24 v3.0. Association testing was performed using logistic regression under an additive genetic model adjusted for age, sex and the first ten principal components (PC1 - PC10). Multiple testing correction was applied using the Bonferroni adjustment. As an additional analysis, knowledge-guided GWAS (KGWAS) followed by MAGMA gene-based testing was used to prioritize candidate genes. After quality control, 345,371 variants were tested. Two loci surpassed the Bonferroni-corrected genome-wide significance threshold: rs28898595 at the UGT1A locus (effect allele C; OR = 0.33, 95% CI = 0.23 - 0.49; p = 3.01 x 10^-8) and rs28709059 in an intronic region of the ACTR3C gene (effect allele C; OR = 0.4, 95% CI = 0.29 - 0.55; p = 4.08 x 10^-8). Several additional loci showed suggestive evidence of association. In gene-level analysis, the CSMD1 gene demonstrated a significant association signal in MAGMA that was consistent across the European (p = 1.16 x 10^-11) and East Asian (p = 9.07 x 10^-11) LD reference panels. This pilot study identifies genome-wide significant loci (UGT1A, ACTR3C genes) and supports the CSMD1 gene as a prioritized candidate gene for the complex phenotype of IHD associated with co-occurring arterial hypertension in the Kazakh cohort. These findings are preliminary and require replication in larger Central Asian cohorts and further functional validation.
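
The Bonferroni step described in this abstract can be illustrated with a short Python sketch. It divides a family-wise error rate by the number of tested variants and compares the two reported p-values against the result. The variant count and p-values come from the abstract; the error rate of 0.05 is an assumption, since the abstract does not state it.

```python
# Bonferroni-corrected threshold for a GWAS, given the number of tested variants.
n_variants = 345_371   # variants tested after quality control (from the abstract)
alpha = 0.05           # ASSUMED family-wise error rate; not stated in the abstract

threshold = alpha / n_variants

# p-values reported in the abstract for the two lead variants.
p_values = {"rs28898595": 3.01e-8, "rs28709059": 4.08e-8}

# A variant passes if its p-value is below the corrected threshold.
significant = {rsid: p < threshold for rsid, p in p_values.items()}

print(f"Bonferroni threshold: {threshold:.3g}")
print(significant)
```

Under this assumption the study-specific threshold is less strict than the conventional 5 x 10^-8 cutoff, because fewer variants were tested than the roughly one million independent tests that the conventional value presumes.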

4
Hair follicle-derived epithelial sheet has potential in vitiligo treatment

Li, J.; Chen, J.; Ling, L.; Tan, Z. L.; Sun, T.; Lin, J.; Chen, S.; Uyama, T.; Zhang, Q.; Liu, Q.; Wu, F.; Wu, W.

2026-03-30 dermatology 10.64898/2026.03.24.26349027 medRxiv
Top 0.1%
2.1%

Vitiligo is an acquired pigmentary disorder of the skin and mucous membranes. Previous work has demonstrated that autologous cultured epithelial grafts (ACEG) are an effective treatment for stable vitiligo. However, harvesting full-thickness skin can cause scarring at the donor site, which has hindered wider application of this technology, especially for patients requiring large-area transplantation. The hair follicle, as a source of keratinocytes and melanocytes, could be a potential source of cells for preparing autologous cultured sheets. Through culture system optimization, we demonstrated maintenance of undifferentiated hair follicle-derived cells in a feeder-independent culture system. After expansion, the hair follicle cells were directed to differentiate into a multi-layered, epidermis-like sheet. Cell identity, viability, purity, genomic stability, and antiseptic testing for the hair follicle-derived epithelial sheet (HFES) were evaluated to ensure its safety. Immunofluorescence staining showed that basal keratinocytes were the main cell type of the autologous HFES. Optimization of culture conditions led to increased melanocyte proliferation and functionality. Transcriptomic analysis confirmed upregulation of melanosome maturation genes. The proportions of cell types were also similar to those found under physiological conditions. Transplantation of HFES to depigmented areas in patients with stable vitiligo resulted in skin repigmentation. This technology provides a novel therapeutic option for vitiligo management.

5
Melanocyte loss dominates the vitiligo transcriptome: a rank-based meta-analysis

Ge, X.

2026-03-12 dermatology 10.64898/2026.02.07.26345817 medRxiv
Top 0.1%
2.0%

Vitiligo is an autoimmune disorder characterized by melanocyte destruction. We performed a rank-based meta-analysis of six independent transcriptomic studies (115 samples) spanning microarray, bulk and single-cell RNA-seq platforms to identify consensus signatures of lesional skin. Robust Rank Aggregation identified 114 differentially expressed genes (FDR < 0.05) with striking asymmetry: 108 downregulated versus 6 upregulated. Downregulated genes were dominated by melanocyte markers (MLANA, TYRP1, DCT, PMEL, KIT). Upregulated genes included interferon-stimulated genes (OAS1, OAS2, EPSTI1). Pathway-level meta-analysis confirmed uniform suppression of melanogenesis, while immune activation was heterogeneous across datasets. Single-cell data from three included studies confirmed melanocyte depletion. The 108 downregulated genes showed exclusive expression in melanocytes. These include neural genes (PLP1, GPM6B, NRXN3), consistent with the neural crest origin of melanocytes. We also identified candidate melanocyte markers such as CYB561A3 and QPCT with high melanocyte specificity and consistent downregulation in vitiligo. These findings reveal a robust melanocyte loss signature in vitiligo detectable across all platforms, and study-dependent immune activation possibly influenced by sampling method and disease characteristics.

6
Single-cell Landscape of T Cell Heterogeneity in Kawasaki Disease: STAT3/JAK Axis Regulates the Lineage Differentiation Bias of Th17 Cells

Song, S.; Zong, Y.; Xu, Y.; Chen, L.; Zhou, Y.; Chen, L.; Li, G.; Xiao, T.; Huang, M.

2026-03-23 bioinformatics 10.64898/2026.03.18.712795 medRxiv
Top 0.1%
1.9%

Background: Kawasaki disease (KD) is a pediatric systemic vasculitis in which T-cell-mediated immune responses play a pivotal role. However, the precise dynamic evolution of T-cell subsets during disease progression remains poorly understood. Methods: Single-cell RNA sequencing (scRNA-seq) was employed to perform high-resolution annotation of peripheral blood mononuclear cells (PBMCs) from healthy controls and KD patients, both pre- and post-IVIG treatment. T-cell developmental trajectories were reconstructed via Monocle3-based pseudotime analysis. Furthermore, the functional significance of the significant pathway was validated in a CAWS-induced KD murine model. Results: A high-resolution single-cell landscape identified 13 distinct T-cell subtypes. Pseudotime analysis revealed a significant lineage commitment of CD4+ T cells toward a Th17 phenotype during the acute phase of KD, synchronized with the transcriptional upregulation of the STAT3/JAK signaling axis. Animal experiments further demonstrated that pharmacological inhibition of this pathway substantially attenuated inflammatory infiltration in the cardiac vasculature of KD mice. Conclusion: This study identifies the STAT3/JAK-mediated Th17 differentiation bias as a potential regulatory program associated with acute inflammation in Kawasaki disease, thereby highlighting the STAT3/JAK axis as a potential therapeutic target.

7
Copy Number Analysis in Congenital Nevi: Concordance and Diagnostic Limitations of aCGH, sWGS, and Methylation Profiling

Karelin, A.; Brecht, I. B.; Pogoda, M.; Demidov, G.; Abele, M.; Schneider, D. T.; Aldea, D.; Etchevers, H. C.; Puig, S.; Hahn, M.; Forchhammer, S.

2026-03-03 dermatology 10.64898/2026.03.03.26347388 medRxiv
Top 0.1%
1.9%

Background: Distinguishing benign proliferative nodules (PNs) from melanoma arising within congenital melanocytic nevi remains a major diagnostic challenge. Copy number alteration (CNA) analysis is widely used to support classification, but current criteria were developed using array comparative genomic hybridization (aCGH). The performance of alternative platforms such as shallow whole-genome sequencing (sWGS) and methylation arrays in this setting is poorly defined. Objectives: The objective of this study is to compare CNA profiles obtained from aCGH, sWGS, and methylation arrays in atypical nodules arising within congenital nevi, and to correlate these molecular findings with clinical outcomes. Methods: Sixteen samples from fourteen patients were retrospectively analyzed using all three platforms. CNAs were cataloged, concordance across methods was quantified using the Jaccard index, and molecular classifications were compared. Clinical follow-up was reviewed to provide clinical context. Results: aCGH detected 39 CNAs, sWGS 60, and methylation profiling 66. Concordance was highest between sWGS and methylation (mean Jaccard 0.67), followed by aCGH versus sWGS (0.64) and aCGH versus methylation (0.49). Cases with high aneuploidy demonstrated strong cross-platform agreement, whereas low-burden lesions exhibited greater variability between methods. Divergent molecular classifications were observed in six cases. Conclusions: While all methods reliably detect broad chromosomal changes, sWGS and methylation arrays identify many additional focal CNAs that may not align with CGH-based diagnostic criteria. Until platform-specific thresholds are established, aCGH remains the most conservative and clinically validated approach for evaluating proliferative nodules in congenital nevi. Significance: Accurate molecular classification of melanocytic proliferations in congenital nevi is essential but challenging, particularly in patients with multiple proliferative nodules. This study provides the first systematic comparison of aCGH, sWGS, and methylation-based CNA profiling in this setting. We show that higher-resolution platforms detect substantially more focal aberrations, which can lead to discordant and potentially overcalled malignancy assessments when applying CGH-derived criteria. Our findings highlight the need for platform-adapted diagnostic frameworks and support continued use of CGH as the most conservative and clinically validated method for risk stratification.
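
For readers unfamiliar with the concordance measure used in this abstract, the Jaccard index is the number of shared calls divided by the number of distinct calls across both platforms. The sketch below is a minimal illustration, not the authors' pipeline: the CNA labels are hypothetical, and it assumes each CNA has already been reduced to a comparable label. The abstract does not describe how calls were matched between platforms.

```python
def jaccard_index(calls_a, calls_b):
    """Jaccard index of two collections of CNA calls: |A & B| / |A | B|."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        # Convention chosen here (an assumption): two empty call sets agree fully.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical CNA labels for one sample on two platforms.
acgh_calls = {"1p_loss", "7q_gain", "9p_loss"}
swgs_calls = {"1p_loss", "7q_gain", "9p_loss", "11q_gain"}

# Three shared calls out of four distinct calls.
print(jaccard_index(acgh_calls, swgs_calls))
```

A per-sample value like this could then be averaged across samples to give a mean Jaccard index of the kind the abstract reports.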

8
Resolution of systemic inflammation in psoriasis following herring roe oil treatment: a post hoc analysis on inflammatory biomarkers in non-severe psoriatic patients

Ringheim-Bakka, T. A.; Gammelsaeter, R.; Tveit, K. S.

2026-04-22 dermatology 10.64898/2026.04.20.26350934 medRxiv
Top 0.1%
1.8%

Background: Psoriasis is a chronic immune-mediated inflammatory disease (IMID) with systemic involvement. In mild-to-moderate disease, circulating cytokines may inadequately capture systemic inflammatory burden. Composite haematological indices derived from complete blood counts, such as the systemic immune-inflammation index (SII) and systemic inflammation response index (SIRI), have emerged as sensitive prognostic markers of systemic inflammation, including in psoriasis. This exploratory post hoc analysis investigated the effects of orally administered herring roe oil (HRO), a phospholipid-rich marine oil, on systemic inflammation in patients with mild-to-moderate psoriasis utilizing these biomarkers. Methods: Data were analysed from a randomized, double-blind, placebo-controlled 26-week clinical study which investigated HRO supplementation in patients (N = 64) with mild-to-moderate psoriasis (NCT03359577). SII, SIRI, neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and monocyte-to-lymphocyte ratio (MLR) were calculated at baseline, week 12, and week 26 for patients where baseline complete blood counts (CBCs) were available (n = 60). Patients missing baseline CBCs were excluded from the analysis. Continuous changes were assessed using ANCOVA with baseline adjustment. Categorical responder analyses with 25% and 30% reduction thresholds, and stratification by baseline biomarker medians, were performed to evaluate treatment responses and the impact of baseline inflammation. Results: Compared with placebo, HRO treatment resulted in significant mean reductions in SII, SIRI, and PLR at week 26, with supportive trends and responder effects observed as early as week 12. Patients with elevated baseline inflammatory indices showed the greatest reductions in systemic inflammation. Stratification by baseline SII further revealed enhanced clinical benefit, with statistically significant PASI50 response rates in the HRO arm at week 26 among patients with lower baseline SII. Conclusion: HRO supplementation was associated with a time-dependent reduction in systemic inflammatory biomarkers in mild-to-moderate psoriasis patients. These findings support the utility of composite inflammatory indices for monitoring systemic inflammation and suggest that baseline SII may have utility in predicting treatment response and may be a useful tool for stratification in clinical trials in mild-to-moderate psoriasis patients. These results could also suggest platform potential of HRO for resolution-oriented interventions across several inflammatory conditions.
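
The composite indices named in this abstract are simple ratios of complete blood count values. The sketch below uses the commonly cited definitions (SII = platelets x neutrophils / lymphocytes; SIRI = neutrophils x monocytes / lymphocytes), which the abstract itself does not spell out, so treat them as an assumption. The `BloodCount` class, the example values, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BloodCount:
    """Absolute cell counts from one complete blood count (e.g. 10^9 cells/L)."""
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


def inflammatory_indices(cbc):
    """Return NLR, PLR, MLR, SII and SIRI using commonly cited definitions."""
    if cbc.lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    nlr = cbc.neutrophils / cbc.lymphocytes
    return {
        "NLR": nlr,
        "PLR": cbc.platelets / cbc.lymphocytes,
        "MLR": cbc.monocytes / cbc.lymphocytes,
        "SII": cbc.platelets * nlr,   # platelets x neutrophils / lymphocytes
        "SIRI": cbc.monocytes * nlr,  # monocytes x neutrophils / lymphocytes
    }


def percent_reduction(baseline, follow_up):
    """Percentage drop from baseline; positive values mean the index fell."""
    return 100.0 * (baseline - follow_up) / baseline


# Hypothetical patient values, for illustration only.
baseline = inflammatory_indices(BloodCount(4.0, 2.0, 0.5, 250.0))
print(baseline)

# Responder check against a 25% threshold (inclusive comparison is an assumption).
print(percent_reduction(baseline["SII"], 350.0) >= 25.0)
```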

9
Genetic loss of JAK1 and cutaneous HPV infection

Fan, S.-Q.; Wang, R.-R.; Colombo, R.; Tang, K.-C.; Liu, J.-W.; Pontoglio, A.; Zhang, L.-L.; Li, K.; Han, S.-R.; Zhang, H.; Bai, X.; Yu, X.; Habulieti, X.; Liu, K.-Q.; Sun, Y.; Sun, L.-W.; Liu, H.; Sun, M.; Lin, Z.-M.; Zhang, F.-R.; Ma, D.-L.; Zhang, X.

2026-04-08 genetic and genomic medicine 10.64898/2026.04.03.26350014 medRxiv
Top 0.1%
1.7%

Background: Human papillomaviruses (HPVs) pose a severe threat to global public health by driving nonmelanoma skin cancer (NMSC) and cervical cancer, with NMSC being one of the most common cancers worldwide. Epidermodysplasia verruciformis (EV) is an inborn error of immunity characterized by an increased susceptibility to persistent infection of cutaneous HPV and a high risk of NMSC. The genetic basis remains unknown in many patients with EV. Methods: We collected four unrelated pedigrees with EV. Genetic analysis identified five variants in JAK1 encoding the Janus kinase 1. Ex vivo models and patient-derived tissue were employed to evaluate the functional effects of JAK1 variants and delineate the pathogenic mechanisms. Results: We identified different variants in JAK1 in four pedigrees with dominant EV. Genetic analysis revealed five novel variants in JAK1, three of which resulted in nonsense-mediated mRNA decay (NMD). Functional assays identified a decreased phosphorylation of the signal transducers and activators of transcription (STATs), impaired interferon responses, and defective T cell activation. Immune dysregulation in patients, characterized by a reduced CD4/CD8 T cell ratio, decreased CD8 naive T cell proportion, and accumulated memory T cells, implies impaired antiviral immunity against HPV. Conclusions: Our findings confirm that JAK1 loss-of-function (LOF) variants underlie susceptibility to cutaneous HPV infection. [Funded by the National Natural Science Foundation of China (81788101, 81230015, 82394420, and 82394423), the National Key Research and Development Program of China (2022YFC2703900), the CAMS Innovation Fund for Medical Sciences (2021-I2M-1-018), and the Regione Lombardia, Italy (Innovative Research Project 1137-2010)].

10
Grading of Erythema and Visual Attributes in Atopic Dermatitis across Diverse Skin Tones Using a Vision AI Pipeline

Abdolahnejad, M.; Kyremeh, M.; Smith, J.; Fang, G.; Chan, H. O.; Joshi, R.; Hong, C.

2026-03-31 dermatology 10.64898/2026.03.30.26349755 medRxiv
Top 0.1%
1.7%

Background: Atopic dermatitis (AD) is a prevalent chronic inflammatory skin disease associated with clinical, psychosocial, and economic burden. Accurate severity assessment is essential for guiding treatment escalation and monitoring disease activity, yet clinician-based scoring systems such as the Eczema Area and Severity Index (EASI) are limited by subjectivity and considerable inter- and intra-rater variability. Erythema, a key driver of AD severity grading, is particularly prone to inconsistent evaluation due to differences in ambient lighting, device quality, skin tone, and rater experience, underscoring the need for objective, reproducible assessment tools. Objective: To develop and validate an artificial intelligence (AI) pipeline for grading erythema, excoriation, and lichenification severity in AD from clinical photographs. The study evaluated the level of agreement between AI severity ratings in each category and those of dermatologists, non-specialists, and a consensus reference standard, with erythema as the primary outcome of interest. Methods: A two-stage AI pipeline was developed using EfficientNet B7 convolutional neural networks (CNNs). The first CNN was trained as a binary AD classifier on 451 AD and 601 non-AD images for lesion detection and segmentation. The second CNN was trained on 173 dermatologist-annotated AD images which were scored on a 0-3 ordinal scale for erythema, excoriation, and lichenification. This CNN was followed by downstream feature extraction algorithms, such as red channel contrast for erythema, Laws' E5L5 for excoriation, and S5L5 texture maps for lichenification. In a cross-sectional validation study, 41 independent test images were scored by two blinded dermatologists and two blinded physicians. AI predictions were compared to individual rater groups and mode-derived consensus scores using weighted Cohen's kappa, classification accuracy, confusion matrices, and error direction analyses. 
Results: On internal validation, the severity CNN achieved 84% overall accuracy (averaged across all three attributes), 86% sensitivity, 87% specificity, and a macro-averaged area under the receiver operating characteristic curve (AUC) of 0.90. In the external comparison with blinded human raters, erythema agreement between the AI and dermatologist consensus was substantial (accuracy 80.7%; kappa = 0.68), with no large (>2-point) misclassifications. Physician consensus agreement was lower (accuracy 54.8%; kappa = 0.34), reflecting greater variability among primary care physicians (non-specialists). For excoriation, AI-dermatologist agreement was moderate (accuracy 72.4%; kappa = 0.62); for lichenification, agreement was similar (accuracy 71.4%; kappa = 0.59). Across all features, disagreements were predominantly between adjacent severity categories. The AI was able to generate erythema severity grades for images of darker skin tones that dermatologists typically would not rate and were marked as "unable to assess". Limitations: The validation set was small (41 images), severe cases (score 3) were underrepresented, one rater participated in both training annotation and validation scoring, and sample size was insufficient for robust stratification by skin tone or body site. Conclusion: The AI pipeline demonstrated dermatologist-level accuracy for erythema scoring, consistent moderate agreement for excoriation and lichenification, and a potential advantage in assessing erythema on darker skin tones. These findings support its potential as a standardized, objective tool for AD severity assessment. Prospective validation in larger, more diverse cohorts is warranted.
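
Weighted Cohen's kappa, the agreement statistic used in this abstract, penalizes disagreements by how far apart the two ordinal scores are. The following pure-Python sketch shows one way to compute it on a 0-3 scale. The abstract does not say whether linear or quadratic weights were used, so the `weights` option and the example scores are assumptions made for illustration.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4, weights="linear"):
    """Weighted Cohen's kappa for two lists of integer scores in 0..n_categories-1."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("rating lists must be non-empty and of equal length")
    k = n_categories

    # Joint distribution of the two raters' scores.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n

    # Marginal distributions used to build chance-expected agreement.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d  # otherwise quadratic

    disagree_obs = sum(penalty(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    disagree_exp = sum(penalty(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    if disagree_exp == 0:
        return float("nan")  # kappa is undefined when no disagreement is expected
    return 1.0 - disagree_obs / disagree_exp


# Hypothetical scores: model output versus a consensus reference on four images.
model_scores = [0, 1, 2, 3]
consensus_scores = [1, 1, 2, 3]
print(weighted_kappa(model_scores, consensus_scores))
```

Because adjacent-category errors receive a small penalty, a model whose disagreements fall mostly between neighbouring grades, as the abstract reports, scores higher than it would under unweighted kappa.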

11
Immune Transcriptional Signatures Across Human Cardiomyopathy Subtypes: A Multi-Cohort Integrative Computational Analysis

Adegboyega, B. B.; Okorie, B.; Courage, P.

2026-03-13 bioinformatics 10.64898/2026.03.10.710912 medRxiv
Top 0.1%
1.5%

Background: Heart failure, arrhythmia, and sudden cardiac death are common outcomes of cardiomyopathies, which are molecularly diverse heart muscle disorders marked by structural and functional myocardial dysfunction. The lack of sensitive molecular biomarkers that precede overt physiological deterioration makes early diagnosis difficult despite advancements in imaging and clinical classification. The immune transcriptional landscape across cardiomyopathy subtypes is still poorly understood, despite growing evidence linking both innate and adaptive immune dysregulation, such as macrophage activation and T-cell and inflammatory cytokine networks, as active contributors to myocardial remodelling and disease progression. Methods: We performed a multi-cohort integrative transcriptomic analysis of 1,068 cardiac tissue samples from five publicly available GEO datasets (GSE57338, GSE5406, GSE36961, GSE141910, GSE47495) spanning dilated, ischemic, hypertrophic, and peripartum cardiomyopathy. Using a fully scripted R and Python pipeline, we conducted differential expression analysis (limma), immune cell deconvolution (xCell), pathway enrichment (clusterProfiler), weighted gene co-expression network analysis (WGCNA), and regularised machine learning classification (LASSO, Random Forest). Cross-dataset validation was performed in two independent cohorts on different microarray platforms. Results: Differential expression analysis identified 43 primary DEGs (FDR < 0.05, |log2FC| > 1.0), revealing a coherent immune-fibrotic program characterized by loss of anti-inflammatory macrophage markers (CD163, VSIG4), complement dysregulation (FCN3), innate interferon activation (IFI44L, IFIT2), and ECM remodelling (ASPN, SFRP4, LUM). xCell deconvolution identified coordinated depletion of adaptive immune populations in failing myocardium. WGCNA defined a fibrosis hub module (brown; CTSK, SULF1, SFRP4) and an immune collapse module (turquoise; MYD88, TNFRSF1A, LAPTM5). A nine-gene LASSO classifier achieved a cross-validated AUC of 0.986, with HMOX2 as the top-discriminating feature, implicating ferroptosis in cardiomyocyte death. Cross-platform validation in an independent HCM cohort (GSE36961) demonstrated a directional concordance of 34.9%. Conclusions: This study defines a reproducible immune-fibrotic transcriptional signature of human cardiomyopathy, nominates HMOX2 and ferroptosis as central pathomechanisms, and provides a validated nine-gene biomarker panel for future translational investigation.

12
A meta-analysis of clinically ascertained lipoedema cohorts from the UK and Spain identifies overlapping susceptibility loci with the UK Biobank

Dobbins, S. E.; Forner-Cordero, I.; Amigo Moreno, R.; Southgate, L.; Hobbs, K.; Moy, R.; Adjei, M.; Muntane, G.; Vilella, E.; Martorell, L.; Gordon, K.; Ostergaard, P. E.; Pittman, A.

2026-02-12 genetic and genomic medicine 10.64898/2026.02.11.26345915 medRxiv
Top 0.1%
1.5%

Lipoedema is a chronic adipose tissue disorder mainly affecting women with excess subcutaneous fat deposition on the lower limbs, associated with pain and tenderness. There is often a family history of lipoedema, suggesting a genetic origin, but the contribution of genetics is not well studied. We conducted a genome-wide association study (GWAS) for this disorder in a clinically ascertained cohort from Spain and performed a meta-analysis with the UK lipoedema cohort GWAS. We then used the results of this study as a replication of the inferred UK Biobank "lipoedema phenotype" study. Whilst our meta-analysis alone did not identify any genome-wide significant associations, our clinical cohorts provide support for three loci identified through the UKBB study: the chr2q24.3 GRB14-COBLL1 locus (rs6753142, P_META = 1.64 x 10^-6), chr6p21.1 VEGFA locus (rs4711750, P_META = 8.99 x 10^-7) and the chr5q11.2 ANKRD55-MAP3K1 locus (rs3936510, P_META = 1.67 x 10^-5). We identify numerous rare SNPs with strong association signals in our meta-analysis (P < 1 x 10^-6) with support in both UK and Spanish datasets, three of which also show nominal support in the UKBB (P < 0.05). These findings provide a starting point towards understanding the genetic basis of clinical lipoedema and demonstrate the utility of combining large-scale biobank genetic data with clinically ascertained cohorts to elucidate the genetic architecture of lipoedema.

13
An AI-Integrated Framework for Precision Genomics in Coronary Artery Disease Using Whole Exome and Phenotypic Data

UPPALURI, K. R.; CHALLA, H. J.; VEMPATI, K. K.; KADALI, L. N.; PALASAMUDRAM, K.; RAYALA, M.

2026-01-30 genetic and genomic medicine 10.64898/2026.01.28.26345099 medRxiv
Top 0.1%
1.5%

Coronary artery disease (CAD) is a multifactorial condition influenced by genetic, phenotypic, and environmental factors. Traditional risk prediction models fall short in capturing the polygenic complexity of CAD, particularly in underrepresented populations. This study presents SIGMA (Scoring Importance of Genes specific to disease using Machine learning Algorithms), a novel AI-powered framework that enhances CAD risk prediction by integrating genomic and phenotypic data. Our approach leverages GEMS (GeneConnectRx Evidence Metrics), an LLM-driven system to score 1772 CAD-associated genes, and CASCADE (Comprehensive Assessment of Sequence and Clinical Annotation Data Evaluation), a tiered variant scoring pipeline. Using whole exome sequencing (WES) data from 1,243 individuals (628 controls, 615 CAD cases), the model integrates age and gender as key non-modifiable phenotypes. Results show significant improvements in sensitivity (from 0.41 to 0.79), specificity (0.70 to 0.72), and AUC (0.59 to 0.81) when phenotype data are incorporated. Our findings highlight the potential of AI-integrated genomics for population-specific CAD risk stratification.

14
Short tandem repeats significantly contribute to the genetic architecture of metabolic and sensory age-related hearing loss phenotypes

Ahmed, S.; Vaden, K. I.; Dubno, J. R.; Wright, G.; Drogemoller, B.

2026-02-18 genetic and genomic medicine 10.64898/2026.02.17.26346449 medRxiv
Top 0.1%
1.5%

Age-related hearing loss (ARHL) is a progressive, bilateral decline in hearing ability that affects one in four individuals over 60 years of age worldwide. While previous genome-wide association studies (GWAS) have identified distinct single-nucleotide variants (SNVs) associated with metabolic and sensory ARHL phenotypes, the contribution of short tandem repeats (STRs) - a neglected yet important class of genetic variants - remains poorly understood. To address this gap, TRTools was used to impute STRs from a high quality, sequencing-derived SNV-STR reference panel to investigate the association between STRs and metabolic and sensory estimates. Heritability analyses revealed that while STRs contribute to estimates of both ARHL components, this class of variation plays a more important role in metabolic hearing loss (6%), which typically increases with age, compared to sensory hearing loss (4%). Further, the inclusion of this class of variant into GWAS analyses uncovered an association between a haplotype consisting of two missense variants (rs7714670 and rs6453022) and an intronic STR (chr5:73778077:A16) in ARHGEF28 (P = 3.30 x 10^-9), providing further insight into the variants driving this previously identified signal. Notably, burden analyses revealed that rare and longer repeats were associated with an increased risk of the metabolic phenotype and a reduced risk of the sensory phenotype. Functional annotation of significant and nominally significant STRs revealed potential effects on gene expression and splicing of nearby genes. Our findings provide the first evidence that STRs explain some of the missing heritability of ARHL phenotypes and create an STR resource for researchers to use in future analyses.

15
T2T-CHM13 reference genome reduces mapping bias and enhances alignment accuracy at disease-associated variants

Cherchi, I.; Orlando, F.; Quaini, O.; Paoli, M.; Ciani, Y.; Demichelis, F.

2026-02-10 genomics 10.64898/2025.12.17.694618 medRxiv
Top 0.2%
1.4%

The T2T-CHM13v2.0 reference genome added previously uncharacterized genomic sequences and improved the accuracy of repetitive stretches compared to former human genome assemblies. By comprehensive allelic variation analysis and read mapping statistics from sequencing reads aligned to hg38 and T2T-CHM13 assemblies in samples encompassing different sequencing designs and ethnicity groups, we observed that T2T-CHM13v2.0 assembly significantly reduces the reference mapping bias (RMB) and increases read mapping precision at clinically relevant sites, including BRCA1 pathogenic variants. Further, we report the presence of sequence dissimilarities among reference genomes in the proximity of ClinVar annotated variants, suggesting the need for data re-analysis and potential redesign of probes targeting clinically relevant regions. Overall, these findings support the implementation of T2T-CHM13 reference for the improvement of sequencing data analyses in the clinical genomic setting.

16
Genome-wide association study of corneal dystrophy uncovers novel risk loci and enables improved polygenic prediction of Fuchs endothelial corneal dystrophy

Insawang, B.; Mackey, D. A.; Hewitt, A. W.; Craig, J. E.; Mills, R.; Gharahkhani, P.; MacGregor, S.

2026-02-15 genetic and genomic medicine 10.64898/2026.02.10.26345409 medRxiv
Top 0.2%
1.3%

Objective: To identify risk loci for Fuchs endothelial corneal dystrophy (FECD) and improve a genetic risk prediction model. Design: Genome-wide association study (GWAS), polygenic risk score (PRS) construction, and TCF4 CTG18.1 short tandem repeat (STR) length inference. Participants: The study included 7,316 Europeans (EUR) with FECD or related corneal dystrophy phenotypes and 1,588,467 controls from the UK Biobank, All of Us, FinnGen, and the Million Veteran Program. Two independent EUR FECD cohorts were used for PRS validation (1,851/2,679 cases/controls and 124/257 cases/controls). African (AFR) ancestry analyses included 455 cases and 121,154 controls to build the PRS. A subset of All of Us participants was used for joint PRS and STR modelling. Methods: GWAS meta-analyses were performed using FECD diagnoses or corneal dystrophy proxies where necessary, with validity assessed via genetic correlation. Risk loci were identified, and ancestry-specific PRSs were constructed using SBayesRC. PRS performance was evaluated across ancestries with and without TCF4 STR data. Main Outcome: We identified novel loci for corneal dystrophy and constructed PRS-based and STR-based prediction models. Results: The GWAS meta-analysis identified 24 risk loci associated with corneal dystrophy, including 12 novel loci, doubling the number reported in previous FECD studies. The optimised PRS outperformed existing models in two independent FECD validation cohorts (AUC = 0.83, 95% CI: 0.82-0.84; DeLong's P = 7.04 × 10⁻¹⁹), with individuals in the top PRS decile showing 14-fold and 19-fold increased risk in the two validation sets, respectively. In All of Us, STR expansion (>40 repeats) was the key predictor of FECD risk, yielding excellent discrimination (AUC = 0.89; OR = 54) with minimal improvement from PRS. Consistent with this, STR expansion remained the primary driver of risk across ancestries, while PRS provided modest independent value for broader corneal dystrophy phenotypes in EUR and admixed American populations. Among participants without large STR expansion, overall predictive performance was modest; PRS was the only significant genetic contributor (OR = 1.37) for broader corneal dystrophy in Europeans, whereas analyses in FECD non-expansion carriers were underpowered. Conclusions: These findings refine the genetic architecture of FECD, enhance risk prediction, and support a tiered strategy integrating STR expansion testing with PRS.
Key Points. Question: Can polygenic risk scores (PRS), alone or combined with TCF4 CTG18.1 short tandem repeat (STR) length, improve genetic risk prediction for Fuchs endothelial corneal dystrophy (FECD)? Findings: In this GWAS meta-analysis of 7,316 cases and 1,588,467 controls, PRS showed strong predictive performance in validation cohorts lacking STR data. When STR length was available, it was the main predictor of FECD risk with limited additional contribution from PRS. Among non-expansion STR carriers, PRS helped stratify risk for broader corneal dystrophy in Europeans. Meaning: PRS provide a practical, complementary approach for FECD risk prediction, particularly when STR data are unavailable.
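The tiered strategy mentioned in the conclusions can be pictured as a two-step rule: check for STR expansion first, and fall back to the PRS only when no expansion is found or STR data are missing. The sketch below is illustrative only. The >40-repeat threshold comes from the abstract, while the `Participant` type, the tier labels, and the top-decile PRS cutoff are hypothetical choices made for this example, not the authors' model.

```python
from dataclasses import dataclass
from typing import Optional

# Expansion threshold stated in the abstract (>40 repeats).
EXPANSION_THRESHOLD = 40


@dataclass
class Participant:
    # Longest TCF4 CTG18.1 allele in repeats; None if STR data are unavailable.
    str_repeats: Optional[int]
    # PRS expressed as a population percentile (0-100); hypothetical encoding.
    prs_percentile: float


def risk_tier(p: Participant, prs_cutoff: float = 90.0) -> str:
    """Assign a coarse risk tier: STR expansion first, PRS second.

    The default cutoff (top decile) is an assumption for illustration.
    """
    if p.str_repeats is not None and p.str_repeats > EXPANSION_THRESHOLD:
        return "expansion carrier"
    if p.prs_percentile >= prs_cutoff:
        return "elevated polygenic risk"
    return "baseline"


cohort = [
    Participant(str_repeats=75, prs_percentile=10.0),
    Participant(str_repeats=None, prs_percentile=95.0),
    Participant(str_repeats=40, prs_percentile=50.0),
]
for person in cohort:
    print(person, "->", risk_tier(person))
```

Ordering the checks this way mirrors the reported finding that STR expansion dominates prediction when it is available, with the PRS adding value mainly where STR data are absent.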

17
Ancestry-stratified variant classification in monogenic diabetes genes: annotation coverage and differential curation burden

Dario, P.

2026-04-07 genetic and genomic medicine 10.64898/2026.04.06.26350230 medRxiv
Top 0.2%
1.3%

Variant databases ClinVar and gnomAD are the backbone of clinical variant interpretation, but their population composition is skewed toward European ancestry. Whether this skew creates systematic classification disadvantages for non-European patients with monogenic diabetes has not been examined at the database level. ClinVar variant_summary (GRCh38, April 2026; 4,421,188 variants) was cross-referenced with gnomAD v4.0 genome data for 17 monogenic diabetes genes. Annotation coverage and variant classification rates were computed stratified by genetic ancestry group (AFR, AMR, EAS, SAS, MID, NFE, FIN, ASJ). Of 14,691 gnomAD variants across the 17 genes, only 29.7% had any ClinVar classification (range: 12.7%-61.3% by gene). Among classified variants, non-Finnish European (NFE) variants had the highest variant of uncertain significance (VUS) rate (32.1%) and the lowest benign/likely benign fraction (41.6%), consistent with a large submission volume without functional follow-up. African-ancestry (AFR) variants showed the second-highest VUS rate (29.2%), not statistically distinguishable from NFE after Bonferroni correction, while all other non-European groups had significantly lower rates (all p < 0.001). GCK showed a pattern inversion - non-European VUS rate (18.5%) exceeding European (15.0%) - consistent with progressive reclassification in European populations absent in non-European cohorts. Annotation coverage and VUS divergence were uncorrelated (r = -0.15, p = 0.57). The primary equity problem is a 70% annotation gap combined with a non-European curation deficit, not a simple VUS excess. Ancestry-stratified evaluation of ClinGen Variant Curation Expert Panel (VCEP) criteria performance is warranted across disease domains.
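The two headline quantities in this abstract, annotation coverage and the per-ancestry VUS rate, are simple proportions. A minimal sketch of how they could be computed follows, assuming each gnomAD variant has already been matched to a ClinVar classification (or to none). The record layout, function names, and toy records are hypothetical and do not reproduce the study's data or pipeline.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical records: (gene, ancestry group, ClinVar classification or None).
Record = tuple[str, str, Optional[str]]

variants: list[Record] = [
    ("GCK", "NFE", "VUS"),
    ("GCK", "NFE", "benign"),
    ("GCK", "AFR", None),
    ("HNF1A", "AFR", "VUS"),
    ("HNF1A", "EAS", None),
    ("HNF1A", "NFE", None),
]


def annotation_coverage(records: list[Record]) -> float:
    """Share of variants that have any ClinVar classification."""
    classified = sum(1 for _, _, cls in records if cls is not None)
    return classified / len(records)


def vus_rate_by_group(records: list[Record]) -> dict[str, float]:
    """VUS fraction among classified variants, per ancestry group.

    Groups with no classified variants are omitted rather than reported as 0.
    """
    classified: dict[str, int] = defaultdict(int)
    vus: dict[str, int] = defaultdict(int)
    for _, group, cls in records:
        if cls is None:
            continue
        classified[group] += 1
        if cls == "VUS":
            vus[group] += 1
    return {g: vus[g] / classified[g] for g in classified}


print(f"coverage: {annotation_coverage(variants):.1%}")
print(vus_rate_by_group(variants))
```

Keeping the two measures separate matters for the abstract's argument: a group can have a low VUS rate simply because few of its variants were ever classified.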

18
Enhanced Hi-C Capture Analysis reveals complex regulatory architecture at the PICALM-EED locus for Alzheimer Disease

Nasciben, L. B.; Wang, L.; Xu, W.; Ramirez, A.; Moura, S.; Lu, L.; Liu, X.; Rajabli, F.; Celis, K.; Gearing, M.; Bennett, D.; Weintraub, S.; Geula, C.; Schuck, T.; Nuytemans, K.; Scott, W.; Dykxhoorn, D.; Pericak-Vance, M. A.; Young, J.; Griswold, A.; Jin, F.; Vance, J. M.

2026-02-17 genomics 10.64898/2026.02.14.705927 medRxiv
Top 0.2%
1.3%

Objective: Both the phosphatidylinositol binding clathrin assembly protein gene (PICALM) and the embryonic ectoderm development gene (EED) have been implicated as causal genes driving a genome-wide association for Alzheimer disease (AD) risk. We employed a new virtual approach using genome-wide chromatin interactions (Hi-C) called enhanced Hi-C Capture Analysis (eHiCA) to identify the genes and regulatory regions that are driving this important AD risk association. Methods: Hi-C data from the frontal cortex of eight AD patients, as well as induced pluripotent stem cell-derived microglia and spheroids of AD and control patients, were used. We applied 14 eHiCA baits, each containing a GWAS SNP, to identify the cis-regulatory interactions in this GWAS locus at a 5 kb resolution. Results: The baits derived from the GWAS-associated haplotype primarily interacted with the PICALM promoter and the large cis-regulatory elements cluster (CREe) lying upstream of the EED promoter. The EED promoter interacts with the PICALM gene body and promoter region but not directly with the associated risk haplotype. Although the AD-associated variants segregate together as a haplotype in the population, each bait exhibited distinct functional chromatin interactions. Interpretation: The PICALM gene is the primary driver of the association in microglia along with the CREe locus. Different SNPs in a segregating haplotype can display different physical Hi-C interactions. This study demonstrates that eHiCA can help resolve the causal genes driving complex GWAS associations, opening new pathways to study Alzheimer disease and other disorders.

19
Distinct cochlear cell types associated with genetic susceptibility to sensory and metabolic hearing loss in older adults from the CLSA

Ahmed, S.; Vaden, K. I.; Dubno, J. R.; Drogemoller, B. I.

2026-02-18 genomics 10.64898/2026.02.17.706270 medRxiv
Top 0.2%
1.3%

Hearing loss is a heterogeneous condition that can be classified into different subtypes with diverse genetic and cellular components. To investigate the cochlear cell types underlying the genetic basis of sensory and metabolic components of age-related hearing loss (ARHL), we integrated human genome-wide association study data with mouse cochlear single-cell RNA sequencing data using the single-cell disease relevance score tool. These analyses revealed that genes associated with the sensory component of ARHL in older humans were most highly expressed in the hair cells, while genes associated with the metabolic component of ARHL in older humans were most highly expressed in spiral ganglion neurons. To assess whether age-related transcriptional changes might influence these patterns, we performed age-stratified analyses. In younger mice, sensory hearing loss-associated genes revealed significant heterogeneity in expression in supporting cells within the sensory epithelium. In contrast, the greatest heterogeneity in the expression of metabolic hearing loss-associated genes was observed in intermediate cells of the stria vascularis in older mice. These findings provide evidence for the role of distinct genetic and cellular risk profiles for different ARHL subtypes, suggesting that prevention and therapeutic strategies may require targeting specific cell populations at different life stages.

20
Ancestry-specific performance of variant effect predictors in clinical variant classification

Hoffing, R.; Zeiberg, D.; Stenton, S. L.; Mort, M.; Cooper, D. N.; Hahn, M. W.; O'Donnell-Luria, A.; Ward, L. D.; Radivojac, P.

2026-02-17 bioinformatics 10.64898/2026.02.14.705914 medRxiv
Top 0.2%
1.3%

Predicting the effects of genetic variants and assessing prediction performance are key computational tasks in genomic medicine. It has been shown that well-calibrated variant effect predictors can be reliably used as evidence towards establishing pathogenicity (or benignity) of missense variants, thereby rendering these variants suitable for use in (or exclusion from) the genetic diagnosis of rare Mendelian conditions. However, most predictors have been trained or calibrated on data that may not be sufficiently representative to lead to similar performance across all genetic ancestries. This raises questions about the responsible deployment of these tools to improve human health. To better understand the utility of computational predictors, we set out to assess their ancestry-specific performance in terms of accuracy and evidence strength according to the ACMG/AMP guidelines. First, we determined that the expected count of rare variants in an individual's genome and the allele frequency distribution of these variants are the key confounders when evaluating a predictor's performance across different genetic ancestries. Second, we found that a predictor's accuracy itself inversely correlates with the allele frequency of the rare variant. After stratifying according to allele frequency, we show that established methods for predicting the pathogenicity of missense variants have comparable performance levels across major ancestry groups. Our results therefore support the wide deployment of such models in the context of genetic diagnosis and related applications.